Automatic Keyword Extraction Using Domain Knowledge
Abstract
Documents can be assigned keywords by frequency analysis of the terms found in the document text, which is arguably the primary source of knowledge about the document itself. Including a hierarchically organised, domain-specific thesaurus as a second knowledge source improved the quality of these keywords considerably, as measured by their match to previously manually assigned keywords.
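The abstract does not say how the thesaurus is combined with the term frequencies, so the following is only a minimal sketch of one plausible scheme, not the authors' method. It counts terms in the text, keeps those found in a thesaurus, and passes a decayed share of each count up to the broader terms. The toy `BROADER` mapping, the `decay` factor, and all function names are assumptions made for illustration, and the hierarchy is assumed to be acyclic.

```python
import re
from collections import Counter

# Hypothetical toy thesaurus: term -> broader term (None marks a top term).
BROADER = {
    "wheat": "cereals",
    "barley": "cereals",
    "cereals": "crops",
    "crops": None,
}


def term_frequencies(text):
    """Count lowercase alphabetic tokens in the document text."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def thesaurus_scores(freqs, broader, decay=0.5):
    """Score thesaurus terms, passing decayed counts up the hierarchy.

    Assumption: each step to a broader term multiplies the weight by `decay`.
    """
    scores = Counter()
    for term, count in freqs.items():
        if term not in broader:
            continue  # terms outside the thesaurus are ignored in this sketch
        weight = float(count)
        node = term
        while node is not None:
            scores[node] += weight
            weight *= decay
            node = broader.get(node)
    return scores


def extract_keywords(text, broader, k=3):
    """Return the k highest-scoring thesaurus terms (ties broken alphabetically)."""
    scores = thesaurus_scores(term_frequencies(text), broader)
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]


sample = "Wheat and barley yields. Wheat prices rose."
keywords = extract_keywords(sample, BROADER)
```

In this sketch a broader term such as "cereals" can rank as a keyword even though it never occurs in the text, which is one way a hierarchy can add knowledge that frequency analysis alone cannot supply.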
Similar resources
Improved Automatic Keyword Extraction Based on TextRank Using Domain Knowledge
Keyword extraction of scientific articles is beneficial for retrieving scientific articles on a certain topic and grasping the trend of academic development. For the task of keyword extraction for Chinese scientific articles, we adopt the framework of selecting keyword candidates by Document Frequency Accessor Variety (DF-AV) and running the TextRank algorithm on a phrase network. To improve domain ...
Automatic keyword extraction using Latent Dirichlet Allocation topic modeling: Similarity with golden standard and users' evaluation
Purpose: This study investigates the automatic keyword extraction from the table of contents of Persian e-books in the field of science using LDA topic modeling, evaluating their similarity with golden standard, and users' viewpoints of the model keywords. Methodology: This is a mixed text-mining research in which LDA topic modeling is used to extract keywords from the table of contents of sci...
A Document Content Extraction Model Using Keyword Correlation Analysis
Owing to the rapid development of information and Internet technologies, a large amount of information and documents can be easily accessed through the electronic network. In addition to the efficiency of document acquisition, another typical issue for document management is document content extraction. In order to provide the critical contents of a document to the knowledge requester, ...
BIOTEX: A system for Biomedical Terminology Extraction, Ranking, and Validation
Term extraction is an essential task in domain knowledge acquisition. Although hundreds of terminologies and ontologies exist in the biomedical domain, the language evolves faster than our ability to formalize and catalog it. We may be interested in the terms and words explicitly used in our corpus in order to index or mine this corpus or just to enrich currently available terminologies and ont...
A Knowledge-Base Oriented Approach for Automatic Keyword Extraction
Automatic keyword extraction is an important subfield of the information extraction process. It is a difficult task, for which numerous different techniques and resources have been proposed. In this paper, we propose a generic approach to extract keywords from documents using encyclopedic knowledge. Our two-step approach first relies on a classification step for identifying candidate keywords followed b...